ABSTRACT



Web pages for commercial applications, such as electronic 
retail, are built "on-the-fly" in Hypertext Markup Language 
(HTML) from product data stored in the merchant's data- 
base. To reduce costs in time and computing resources and 
to improve customer access to data from the merchant's web
site, pages created in HTML are cached on the merchant 
server. On a customer request for a page, the merchant server 
checks the cache first for the page, and if it isn't found there, 
generates a new page from the database. To maintain the 
validity of the content of the cached pages, the database 
tables include triggers that cause identifying information for 
any changes made on the stored data to be forwarded to a 
cache log. A synchronization daemon walks the cache log 
from time to time to locate pages that should be purged from 
the cache because their content is no longer synchronous 
with the data stored in the database. By setting preferences 
for identifiable customers in advance, data generated from
the database can be selected for specific customer groups. 
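
The lookup described in the abstract (check the cache first, generate from the database only on a miss) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `cache_file_name` and `get_page` are hypothetical, the cache is assumed to be a plain directory of files, and the file-name pattern follows the examples given later in the description (for example `prmenbr1_prrfnbr10_sgrfnbr3_.ncibm`).

```python
import tempfile
from pathlib import Path


def cache_file_name(prefix, merchant, ref, significances=()):
    """Build a cache file name such as 'prmenbr1_prrfnbr10_sgrfnbr3_.ncibm'.

    prefix is 'pr' for product pages or 'cg' for category pages;
    significances is a sequence of extra (name, value) pairs.
    """
    parts = [f"{prefix}menbr{merchant}", f"{prefix}rfnbr{ref}"]
    parts += [f"{name}{value}" for name, value in significances]
    return "_".join(parts) + "_.ncibm"


def get_page(cache_dir, file_name, generate):
    """Return (html, served_from_cache); generate() runs only on a cache miss."""
    path = Path(cache_dir) / file_name
    if path.exists():
        return path.read_text(encoding="utf-8"), True
    html = generate()  # stands in for querying the database and building HTML
    path.write_text(html, encoding="utf-8")
    return html, False


def demo():
    calls = []

    def build_from_database():
        calls.append(1)
        return "<html><body>Product 10 (Platinum)</body></html>"

    with tempfile.TemporaryDirectory() as cache_dir:
        name = cache_file_name("pr", 1, 10, [("sgrfnbr", 3)])
        first = get_page(cache_dir, name, build_from_database)
        second = get_page(cache_dir, name, build_from_database)
    return name, first[1], second[1], len(calls)
```

In `demo`, the first request misses the cache and triggers generation; the second request for the same file name is served from the cache directory without calling the generator again.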

14 Claims, 4 Drawing Sheets 




04/09/2004, EAST Version: 1.4.1 



U.S. Patent Feb. 6, 2001 Sheet 1 of 4 US 6,185,608 B1



FIGURE 1 (flow diagram)

100 RECEIVE URL REQUEST
102 SERVER CALLS CUSTOMIZED API EXTENSION
104 SEARCH CACHE DIRECTORY
(branch labeled NO)
110 QUERY DATABASE TO BUILD HTML DOCUMENT
112 STORE FILE IN CACHE DIRECTORY






FIGURE 2 (flow diagram)

200 SET DESIRED PARAMETERS IN CONFIGURATION FILE
202 HTML FILE GENERATED FROM DATABASE
108 SEND FILE TO USER'S BROWSER
204 STORE FILE IN CACHE DIRECTORY BY PARAMETER NAME AND VALUE






FIGURE 3 (schematic diagram)

304 CACHE, containing three elements labeled 306
300 RELATIONAL DATABASE, containing three elements labeled 302



FIGURE 4 (flow diagram)

400 CHANGE IN DATABASE DESIGNATED TABLE
(branch labeled YES)
404 STORE PAGE IDENTIFIER IN CACHLOG
406 DAEMON QUERIES CACHLOG
(branch labeled YES)
410 PURGE CACHE RECORDS FOR IDENTIFIED PAGE(S)




CACHING DYNAMIC WEB PAGES 

FIELD OF THE INVENTION

This invention relates to improvements in presenting data 
over the Internet, and in particular, provides a mechanism 
for dynamically caching and validating web pages generated 
from data stored in a database.

BACKGROUND OF THE INVENTION 

The Internet is a vast computer network consisting of 
many smaller networks spanning the globe. It is well known
"lore" that the Internet was started in the late 1960's as a
development project of the U.S. Department of Defense to
provide a back-up communications system that would be
virtually impossible to destroy in the event of a major
catastrophe. The Internet has grown exponentially, and
millions of private users and corporations now use it daily for
all kinds of communications needs.

The World Wide Web (WWW) was developed in 1991 as 
an information system running over the Internet. The WWW
is based on the concept of "hypertext" and a transfer method
known as HTTP (Hypertext Transfer Protocol). HTTP is
designed to run primarily over TCP/IP (Transmission
Control Protocol/Internet Protocol), a networking protocol that
permits use of the Internet. One increasing use of the WWW
is commercial: with recent improvements in secure
transactions as well as graphical presentation, merchants can
display and sell their goods and services over the Internet.

One format for information transfer over the WWW is to 
create documents using Hypertext Markup Language 
(HTML), a programming language that supports
navigational linking ("hypertext links"). HTML is a structured
language, based on SGML (Standard Generalized Markup
Language), a document processing system. Like SGML,
HTML describes the structure of the document through a
system of tags; HTML pages are made up of standard text
as well as formatting codes for headings, paragraphs, lists,
tables and character styles, that indicate how the page should
be displayed. HTML includes a tag called a "link tag" that
provides the programming for nonlinear navigational links.
One example of the use of HTML pages with navigational
links in the context of business documents is described in
U.S. Pat. No. 5,692,073 to Xerox Corporation for "Formless
Forms and Paper Web Using a Reference-Based Mark 
Extracting Technique". 

The WWW makes use of the Uniform Resource Locator
(URL) to define the address of a particular page on the 
Internet. The URL naming system consists of three parts: the 
transfer format (often "http") followed by a colon and two 
forward slashes (://), the name of the host machine that holds 
the file, and finally, the path to the file on the host machine. 
In a typical piece of hypertext, the data stored in the 
hypertext link is a label pointing to a remote destination. 
This is programmed in HTML by embedding the address of
the link destination, the URL, in the link tag.
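
The three-part URL structure described above (transfer format, host name, path) can be illustrated with Python's standard `urllib.parse` module. This is only a sketch: the helper name `url_parts` and the example address are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into (transfer format, host machine, path on the host)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# Hypothetical address used only for illustration.
print(url_parts("http://www.example.com/store/index.html"))
# → ('http', 'www.example.com', '/store/index.html')
```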

When a client accesses a web page, it does so through a 
software program called a browser which establishes the
connection with the server hosting the page. The server
executes corresponding server software which presents
information to the client in a transfer format (e.g., http)
response corresponding with the web page or other data 
generated by the server. As the web page is initialized on the 
client machine, the browser renders the text and graphics for 
it from the HTML data. 

While HTML is used to deliver data on the web, most of
the underlying information is not stored in HTML, but in
other, richer storage formats, such as SGML and legacy
systems such as databases. The data in these other formats
must often be converted to HTML dynamically. Methods for
converting files from SGML to HTML, including adding
"anchors" or navigational links referencing other files during
the conversion, are discussed in U.S. Pat. No. 5,530,852 of
Sun Microsystems, Inc., titled "Method for Extracting
Profiles and Topics from a First File Written in a First Markup
Language and Generating Files in Different Markup
Language Containing the Profiles and Topics for use in
Accessing Data and Described by the Profiles and Topics", and in
"HTML makes a great delivery vehicle for Web-based
information. It just isn't a sensible place for much of that
information to live in." by R. Light, Archives and Museums
Informatics, vol. 9, no. 4, pp. 381-387, 1995.

In a commercial web site, a store sells its products to 
potentially millions of customers on the Internet by
displaying the products through HTML documents. It is common
that a merchant may have thousands of products in its
catalog to sell. It is tedious, error prone and nearly
impossible to manually create and manage the static HTML
documents for navigating to and displaying this large
number of products.

A merchant server system helps the merchant manage the
catalog data and provides the support to sell products on the
merchant's web site. In a merchant server system, the
merchant's catalog data are commonly stored in a relational
database. There are database tables for storing product
information, tables for grouping related products together
into categories and related categories together into higher
level categories, and tables for storing category information.
When a shopper goes to the merchant's web site from his
browser, the merchant server accesses the data in the
database through a structured query (SQL) and dynamically
generates HTML documents to show the category and
product pages as the shopper navigates through the
merchant's store. For example, U.S. Pat. No. 5,692,181 of NCR
Corporation for "System and Method for Generating
Reports from a Computer Database" discusses the problems
associated with organizing interrelated data in database
tables, and generating customized HTML documents, in this 
case, reports, from data stored in relational databases. 
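
As a rough illustration of generating an HTML page from relational data with an SQL query, the sketch below uses Python's built-in `sqlite3` module. The table `product`, its columns, and the function names are hypothetical and chosen only for this example; they are not taken from any merchant server product.

```python
import sqlite3
from html import escape


def build_product_page(conn, prrfnbr, prmenbr):
    """Query one product row and render it as a small HTML document."""
    row = conn.execute(
        "SELECT name, price FROM product WHERE prrfnbr = ? AND prmenbr = ?",
        (prrfnbr, prmenbr),
    ).fetchone()
    if row is None:
        return "<html><body><p>Product not found.</p></body></html>"
    name, price = row
    return (
        "<html><body>"
        f"<h1>{escape(name)}</h1>"  # escape text taken from the database
        f"<p>Price: {price:.2f}</p>"
        "</body></html>"
    )


def demo_connection():
    """Create an in-memory database holding one illustrative product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE product "
        "(prrfnbr INTEGER, prmenbr INTEGER, name TEXT, price REAL)"
    )
    conn.execute("INSERT INTO product VALUES (8, 1, 'Dress pant <slim>', 49.5)")
    return conn
```

Every call to `build_product_page` costs a database round trip and a rendering step, which is the per-request work that the caching scheme is meant to avoid.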

In an electronic retail situation, a shopper usually enters 
the web site for a department store, for example, at the
store's home page. From the home page, the shopper can
click on a link to visit a top level category such as the Men's
Wear department. From the Men's Wear page, he can choose
the Pant section among other links to second level categories
on the page. As the shopper navigates down the category
hierarchy, he reaches a product page that shows a dress pant
of a certain brand and the available sizes and colors. He can
now pick the size and color he wants, and order the pant. The
merchant server will take him through the ordering pages
where he can provide the payment and shipping information.
When the ordering steps are done, the order information will
be recorded in the database and the merchant will be able to 
use this information to fulfill the order later. 

While dynamically generating the category and product 
pages is desirable so that the merchant only needs to
manage the catalog information in the database, it takes up
processing cycles in the merchant server to access the
database and dynamically create the HTML pages the
shopper wants to see. If the web site receives heavy traffic, this
can significantly slow the shopping experience. A category
or product page is the same one whether it is generated the
first time or subsequent times until the corresponding
catalog data in the database is changed.
It can significantly reduce the load on the merchant server
and improve the system performance if the generated pages
can be saved for subsequent access and are re-generated
only when the corresponding catalog data is changed. The
shoppers will see a much better response time in navigating
through the category and product pages because the pages
are readily displayable from the web site once they have
been "cached".

However, one problem for the merchant server is being
able to maintain the validity of the cached pages
automatically so that the caching function becomes completely
transparent to the merchant, who will manage the catalog
data as usual. That is, when the data in the database used for
cached pages is changed, it would be preferable if the
merchant server was able to purge invalid cache pages
automatically and re-generate new ones as they are needed.

SUMMARY OF THE INVENTION

It is an object of the present invention to address the design
of caching the dynamically generated pages for future use
while maintaining the validity of the cached pages.

Accordingly, the present invention provides a document
processing system for transmitting data for display on a
client machine from a server. The system consists of data
storage connected to the server, a converter program in the
server for transforming data from the data storage into
transmissible form, such as HTML, for display on the client
machine, a cache on the server for storing one or more
copies of the transformed data in transmissible form, and
means in the server for checking the cache for a copy of the
transformed data in transmissible form before activating the
converter program on receiving a request for data
transmission from the client. Preferably, the data storage is a database
which includes a trigger mechanism to notify the server of
a change to the stored data. The server would include a
synchronizer adapted to purge from the cache copies of the
transformed data affected by the change to the stored data.
Preferably, also, the converter program includes means for
querying the client's identity, locating preferences
corresponding to the client's identity and selecting data from the
data storage according to the located preferences.

According to another aspect, the present invention
provides a method for maintaining a valid cache of data
generated in displayable form from a computer data storage.
The method is executed in a computer by storing in cache at
least one copy of data generated from the data storage in
displayable form. On receiving a request for transmission of
data in displayable form, the request is compared with the
data in the cache. If a match is found, the data is transmitted
from the cache. Otherwise, a copy of data from the data
storage is generated in displayable form for transmission.
Also, on receiving notification of a change to data in the data
storage, the notification is compared with the data in the
cache and, if matched, is purged from the cache.

The invention also provides a computer implemented
method for generating data in displayable form from a
computer data storage according to user preferences in
response to a client request. The method consists of
identifying the client, matching the client's identity with a
predetermined preferences file, selecting data from the data
storage according to the preferences file and generating the
selected data in displayable form for transmission to the
client.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will now be described in
detail in association with the accompanying drawings, in
which:

FIG. 1 is a flow diagram illustrating the steps for
obtaining a page from a merchant server for display by a shopper's
browser;

FIG. 2 is a flow diagram illustrating a method for creating
cache files based on special parameters;

FIG. 3 is a schematic diagram showing elements of the
preferred embodiment of the invention; and

FIG. 4 is a flow diagram illustrating a method for
maintaining validity of the cache when updates to the data stored
in the database have been made.

DETAILED DESCRIPTION OF THE
PREFERRED EMBODIMENTS

The preferred embodiment is described in the context of function
provided by the Net.Commerce product of International
Business Machines Corporation. This product enables
merchants to develop electronic sales channels of the type
described above. However, as will be appreciated by the
person skilled in the art, the concept of the invention is
applicable to similar systems that perform dynamic
generation of HTML pages by accessing data in a database.

In Net.Commerce, there are two command URLs to
display category pages and product pages respectively. The
former is ;display/category and the latter is ;display/item.
The category command takes two parameters, one is the
category reference number and another is the merchant
reference number. Similarly, the product command takes
two parameters, one is the product reference number and
another is the merchant reference number. A reference
number in Net.Commerce is a primary key in a database
table. A category reference number uniquely qualifies which
category to display, and a product reference number for
which product to display.

As shown in FIG. 1, cache pages are created on demand.
This means that they are not stored into the file system until
requested. With the help of the caching function in
Net.Commerce, each time a user requests a product or
category page, upon receiving the URL request (block 100),
the web server calls a customized API extension provided by
Net.Commerce to search a cache directory (the location has
been configured during installation) for the requested file
(blocks 102, 104). If the file exists in the cache, it is
immediately sent to the user's browser (blocks 106, 108). If
the file is not in the cache, it must be generated dynamically
in the usual way. The database is queried to build the HTML
document (blocks 106, 110). It is then returned to the user's
browser (block 108), and is also captured and stored in a file
in the cache directory, where it will be available the next
time requested (block 112).

File names created in the cache will typically look like
this:

cgmenbr1_cgrfnbr5_.ncibm
prmenbr1_prrfnbr8_.ncibm

These files would be created by caching the results of the
commands

;display/category?cgmenbr=1&cgrfnbr=5 and
;display/item?prmenbr=1&prrfnbr=8, respectively.

In the above examples, "5" is the value of the category
reference number (cgrfnbr) in the first command and "8" is
the value of the product reference number (prrfnbr) in the
second. In both cases, the merchant reference number is "1".

By default, files are only cached based on product or
category reference numbers. Anyone requesting a product or
category page using the display command would receive the
same file from the cache. However, the present invention
provides a means to display different product or category
pages based on parameters other than the products or
categories themselves, and this is illustrated in FIG. 2 and
discussed below. The additional parameters are termed
significances here.

The way that files are cached can be customized by setting
significances in a configuration file (block 200). To do
this, significances on the following lines can be entered in
the configuration file, separated by commas, using the
following syntax:

.NC_o,,._SIG_PARMS parameter_name(s) (if the
parameter is for the display category command)

NC_ITEM_SIG_PARMS parameter_name(s) (if the
parameter is for the display product command)

where "parameter_name" is the name of an additional
parameter passed to the display command.

Significances affect the file names of the cached files,
refining the way they can be displayed, and they are passed
to the caching utility as display commands. The significance
causes the HTML files generated by the display command
(blocks 202, 108) to be stored in the cache using file names
that contain the parameter name and its value on the
command (block 204). When the user requests a page containing
those parameters, the caching utility will now be able to
distinguish that page in the cache by its file name following
the method described above and illustrated in FIG. 1.

An example to illustrate the use of significance follows. A
merchant needs to display unique product pages to members
of different shopper groups in its store. A Shopper Group
table is provided in the Net.Commerce database for storing
the shopper group information. In the present example, there
are three shopper groups in the table under the names "Gold",
"Silver" and "Platinum". Their shopper group reference
numbers are "1", "2" and "3", respectively. Because
category and product pages with different contents will be
dynamically created for different shopper groups, a
significance is added to the configuration file to distinguish the
different shopper groups, so that different files will be stored
in the cache for different shopper groups. The following line
is added to the configuration file:

NC_ITEM_SIG_PARMS sgrfnbr

where "sgrfnbr" is the parameter to differentiate the shopper
groups.

Next, the parameter "sgrfnbr" is added to the ;display/item
command. For example, the command
;display/item?prrfnbr=10&prmenbr=1&sgrfnbr=3 requests a page
for a member of the third shopper group, the "Platinum"
group. It passes a parameter for the shopper group in addition
to the default product and merchant reference numbers.

When a significance such as sgrfnbr in this example is
added, the cached file name would appear as:

prmenbr1_prrfnbr10_sgrfnbr3_.ncibm

where "10" is the product reference number and "3" is the
value of the name/value pair (sgrfnbr=3) in the ;display/item
command.

By adding the significance, file names are created in the
cache that the caching utility will recognize. A separate file
will be cached, and can therefore be served, based on each
significance.

To maintain the validity of the cache files, a
synchronization daemon, a housekeeping or maintenance utility, in
Net.Commerce automatically handles file purging by
deleting cache files that contain product or category information
that has been changed or deleted. The daemon relies on the
records in a specific table called CACHLOG in the
Net.Commerce database to identify cache files that contain
product or category information that has been changed or
deleted.

The records in CACHLOG are created as illustrated in
FIGS. 3 and 4 and described below.

As shown in FIG. 3, the Net.Commerce product provides
web page delivery for product information contained in a
number of tables 302 in a relational database 300. The
CACHLOG table 308 is for identifying what cache pages
306 need to be purged from cache 304 as a result of changes
made in the database 300.

Information that a change has been made is propagated by
triggers on the database tables. In the preferred
embodiment, triggers are installed on the following product-
and category-related tables:

PRODUCT (the product table)
PRODPRCS (the product price table)
PRODATR (the product attribute table)
PRODDSTATR (the product distinct attribute table)
PRODSGP (the product shopper group template table)
CATEGORY (the category table)
CGRYREL (the category relationship table)
CGPRREL (the category product relationship table)
CATESGP (the category shopper group template table)

A record 310b in the CACHLOG table 308 contains a
name-value pair which identifies one or more cached pages
that need to be purged. The same name-value pair was used
to generate the file names 310a of these pages in the ;display
commands previously.

For example, a page resulting from the command
;display/item?prrfnbr=123&prmenbr=2 is cached with a file name
containing the name-value pair "prrfnbr" and "123". As
shown in FIG. 4, when a database record associated with the
product of product reference number equal "123" is changed
(block 400) for a page previously generated in HTML (block
402), a record having the name-value pair "prrfnbr" and
"123" will be created in the CACHLOG table by the
corresponding database trigger (block 406).

The synchronization daemon periodically queries the
CACHLOG table to determine whether any new log records
have been added (block 408), and purges cache files which
may be affected by the changes in the database (block 410).
In the example above, the daemon will purge all pages
having the name-value pair "prrfnbr" and "123" on their file
names. In fact, in the preferred embodiment, the
synchronization daemon purges more pages in order to maintain
cache validity. In this example, all cache pages of the
categories to which the product "123" belongs will be
purged.

The merchant may need to create a custom trigger if a
custom table has been created from which information is
retrieved to create product or category pages. For example,
the merchant may create a table, PRODEXTINFO, that
contains extra text information about products to be included
in the displayed pages. The table contains a column,
PEPRNBR, that is a foreign key to the product reference
number, and another column, PETEXT, that contains the
text itself. Because column PETEXT is selected in an SQL
query when generating the product page, a cache file created
from information retrieved must be purged when the
PETEXT value for a product has changed. If the merchant
server updates PETEXT in the record with PEPRNBR equal
10, the custom trigger created on this table will log the
following record to the CACHLOG table:

('prrfnbr', 10, CURRENT TIMESTAMP)

Now, when the synchronization daemon accesses the
CACHLOG table, it will discover a new record and will
delete all product pages pertaining to the product with
reference number "10".
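
The custom-trigger example and one pass of the synchronization daemon can be sketched as follows. This is an illustration under stated assumptions, not the Net.Commerce implementation: it uses SQLite through Python's `sqlite3` module (so the timestamp keyword is written `CURRENT_TIMESTAMP`), the CACHLOG column names `name`, `value` and `logged_at` are hypothetical, and clearing the log after each pass is a design choice made here rather than something stated in the description.

```python
import sqlite3
import tempfile
from pathlib import Path

# PRODEXTINFO, PEPRNBR, PETEXT and CACHLOG come from the description;
# the CACHLOG column names and the trigger name are assumptions.
SCHEMA = """
CREATE TABLE PRODEXTINFO (PEPRNBR INTEGER, PETEXT TEXT);
CREATE TABLE CACHLOG (name TEXT, value INTEGER, logged_at TEXT);
CREATE TRIGGER prodextinfo_changed AFTER UPDATE OF PETEXT ON PRODEXTINFO
BEGIN
    INSERT INTO CACHLOG VALUES ('prrfnbr', NEW.PEPRNBR, CURRENT_TIMESTAMP);
END;
"""


def purge_from_log(conn, cache_dir):
    """One daemon pass: delete cache files named in CACHLOG, then clear the log."""
    purged = []
    rows = conn.execute("SELECT DISTINCT name, value FROM CACHLOG").fetchall()
    for name, value in rows:
        token = f"{name}{value}"
        for path in sorted(Path(cache_dir).glob("*.ncibm"), key=lambda p: p.name):
            # Match whole underscore-delimited segments so that
            # 'prrfnbr10' does not also match 'prrfnbr100'.
            if token in path.name.split("_"):
                path.unlink()
                purged.append(path.name)
    conn.execute("DELETE FROM CACHLOG")
    return purged


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO PRODEXTINFO VALUES (10, 'old text')")
    with tempfile.TemporaryDirectory() as cache_dir:
        for file_name in (
            "prmenbr1_prrfnbr10_.ncibm",
            "prmenbr1_prrfnbr10_sgrfnbr3_.ncibm",
            "prmenbr1_prrfnbr100_.ncibm",
        ):
            (Path(cache_dir) / file_name).write_text("<html></html>")
        # The update fires the trigger, which logs ('prrfnbr', 10, timestamp).
        conn.execute("UPDATE PRODEXTINFO SET PETEXT = 'new text' WHERE PEPRNBR = 10")
        purged = purge_from_log(conn, cache_dir)
        remaining = sorted(p.name for p in Path(cache_dir).iterdir())
    return purged, remaining
```

In `demo`, updating PETEXT for product 10 purges both cached pages for that product (with and without the shopper group significance) and leaves the page for product 100 untouched. Purging the pages of the categories that contain the product, as the preferred embodiment does, is not covered by this sketch.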






The embodiments of the invention in which an exclusive 
property or privilege is claimed are defined as follows: 

1. A document processing system for transmitting data for 
display on a client machine from a server, comprising:

data storage connected to the server, said data storage
including a trigger mechanism to notify the server of a
change to the stored data;

a converter program in the server for transforming data
from the data storage into transmissible form for
display on the client machine;

a cache on the server for storing one or more copies of the
transformed data in transmissible form; and

means in the server for checking the cache for a copy of
the transformed data in transmissible form before
activating the converter program on receiving a request for
data transmission from the client, wherein said server
further comprises a synchronizer adapted to purge from
the cache copies of the transformed data affected by the
change to the stored data.

2. A document processing system, according to claim 1, 
wherein the transmissible form is Hypertext Markup
Language.

3. A document processing system, according to claim 1, 
wherein the data storage is a database.

4. A document processing system, according to claim 3, 
wherein the database is a relational database. 

5. A document processing system, according to claim 1 or 
3, wherein the server further comprises: 

a log identifying copies of transformed data affected by the
change to the stored data; and

a synchronizer adapted to walk the log to identify said 
copies of the transformed data affected by the change to 
the stored data and to purge said identified copies. 

6. A document processing system, according to claim 1, 
wherein the converter program includes means for: 

querying the client's identity;

locating preferences corresponding to the client's iden- 
tity; and 

selecting data from the data storage according to the 
located preferences. 

7. A document processing system, according to claim 1, 
wherein the means in the server for checking the cache 
includes means for:

querying the client's identity; 

locating preferences corresponding to the client's iden- 
tity; and 

selecting data from the data storage according to the 
located preferences. 

8. A document processing system for transmitting data for 
display on a client machine from a server, comprising: 



data storage connected to the server, wherein the data
storage includes a trigger mechanism to notify the
server of a change in the stored data;

means for selecting data from the data storage in response
to receiving a request for data transmission from the
client;

a converter program in the server for transforming the
selected data from the data storage into transmissible
form for display on the client machine;

a cache on the server for storing one or more copies of the
transformed selected data in transmissible form; and

means in the server for checking the cache for the copy of
the transformed selected data in transmissible form
before activating the converter program on receiving a
request for data transmission from the client, wherein
the server further comprises a synchronizer adapted to
purge from the cache the copy of the transformed
selected data if said selected data is affected by the
change to the stored data.

9. A document processing system, according to claim 8, 
wherein the transmissible form is Hypertext Markup Lan- 
guage. 

10. A document processing system, according to claim 8, 
wherein the data storage is a database. 

11. A document processing system, according to claim 10, 
wherein the database is a relational database. 

12. A document processing system, according to claim 8 
or 10, wherein the server further comprises: 

a log identifying the copy of the transformed selected data
if said selected data is affected by the change to the
stored data; and

a synchronizer adapted to walk the log and to purge the
transformed selected data if said transformed selected
data is identified in the log.

13. A document processing system, according to claim 8, 
wherein the converter program includes means for: 

querying the client's identity;

locating preferences corresponding to the client's iden- 
tity; and 

selecting data from the data storage according to the 
located preferences. 

14. A document processing system, according to claim 8, 
wherein the means in the server for checking the cache 
includes means for: 

querying the client's identity;

locating preferences corresponding to the client's iden- 
tity; and 

selecting data from the data storage according to the 
located preferences. 